Draft version February 9, 2010 

Preprint typeset using L£-T^X style cmulatcapj v. 11/10/09 



O 

(N 

X> 

PL, 
OS 



O 

u 

6 
to 

(N 
> 
m 

(N 



o 



X 



THE TRANSVERSE PECULIAR VELOCITY OF THE Q2237+0305 LENS GALAXY AND THE MEAN MASS 

OF ITS STARS 

Shawn Poindexter 1 , Christopher S. Kochanek 1 

Draft version February 9, 2010 

ABSTRACT 

Using 11-years of OGLE V-band photometry of Q2237+0305, we measure the transverse velocity 
of the lens galaxy and the mean mass of its stars. We can do so because, for the first time, we fully 
include the random motions of the stars in the lens galaxy in the analysis of the light curves. In 
doing so, we are also able to correctly account for the Earth's parallax motion and the rotation of 
the lens galaxy, further reducing systematic errors. We measure a lower limit on the transverse speed 
of the lens galaxy, v t > 338 km s -1 (68% confidence) and find a preferred direction to the East. The 
mean stellar mass estimate including a well-defined velocity prior is 0.12 < (M/M Q ) < 1.94 at 68% 
confidence, with a median of 0.52 Mq. We also show for the first time that analyzing subsets of a 
microlensing light curve, in this case the first and second halves of the OGLE V-band light curve, give 
mutually consistent physical results. 

Subject headings: gravitational lensing — methods: numerical — quasars: general — quasars: indi- 
vidual (Q2237+0305) — galaxies: kinematics and dynamics 



1. INTRODUCTION 

Quasar microlensing provides a unique tool for study- 
ing the properties of cosmologicall y distant lens galax - 
ies and the structure of quasars (see Wambsganss 2006). 
Each of the multiple images of the quasar passes through 
the gravitational potential of the stars along the line-of- 
sight in the lens galaxy. These stars microlens each of 
the "macro" images, so the total magnification of each 
quasar image is strongly affected by the lensing effects 
of the stars and the size of the quasar emission region. 
Since the observer, lens galaxy, stars, and source quasar 
are all moving, these magnifications change on timescales 
of 1-10 years with order unity amplitudes. 

The relevant physical scale for quasar microlensing is 
the Einstein radius projected into the source plane plane, 
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where G is the gravitational constant, c is the speed 
of light, (M) is the mean stellar mass of the stars, 
-Dls>-Dol, and Dos are the angular diameter distances 
between the lens-source, observer-lens, and observer- 
source respectively, where we have used the lens and 
source redshifts for Q223 7+0305 {z\ = 0.0394 , z s = 
1.685. iHuchra et al.l fl985l Q2237 hereafter). The ob- 
served microlensing amplitude is controlled by the ra- 
tio between the source size, R y ~ 6 x 10 15 cm (in V- 
band, see our companion paper iPoindexter fc Kochanekl 
(2009), hereafter Paper II) and Re, in the sense that 
smaller accretion disks produce larger variability ampli- 
tudes than larger disks. If a source is much larger than 
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Re, there is little change in the magnification. 

The timescale for variability is determined by the 
relative velocities of the observer, the lens, its stars, 
and the source. Genera lly, the lens motions dominate 
(jKavser fc Refsdal 119891) . leading to two characteristic 
timescales. There is the timescale to cross an Einstein 
radius, 
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and there is the timescale to cross the source, 
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where v\ ens is the expected transverse speed of the lens for 
Q2237. These timescales are also aff ected by the direc- 
tion of motion relative to the shear (Wambsganss ct al. 
1990). Variation is guaranteed on timescale £e and can 
be observed on timescale t s if the magnification pattern 
locally has structure on the scale of Ry. 

Quantitative studies using quasar microlensing have 
exploded in the last few years. Recent efforts 
have studied the relationships between accretio n 
disk size and black hole mass (iMorgan et ail [2 010). 
size a nd wavelength (lAnguita et al.l 120081: iBate et al.l 
20081: lEigenbrod et all l2008bb IPoindexter et al.l 120081 : 
Floyd et all l2009t iMosquera et al.l 12009( 1. sizes of 
non-thermal (X-ra y ) and thermal emission regions 



(IPooley et al.l 120071 iMorgan et al.l 120081: iChartas et all 
I2009t iDaietalJ I2009D. and the dark matter frac- 
tion of the lens (|Dai et al.l 120091: IPooley et all 120091: 
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iMediavilla et al.|[2009[ ) . All these studies have used static 
magnification patterns which ignore the random stellar 
motions in the lens galaxy. However, the stellar velocity 
dispersions of lens galaxies are comparable to the pe- 
culiar velocities of galaxies, and ignoring them can lead 
to biased results. For example, the average direction 
of motion of the source is the same for all images, but 
this coherence is limited by the random motions of the 
stars. With fixed patterns one must either overestimate 
the coherence, by using the same direction of motion for 
each image, or underestimat e it by using indepen dent 
directions for each image (see lKochanek et alJ l2007). In 
either case, it would be dangerous to attempt measure- 
ments that depend on this coherence, such as disk shape 
and orientation, or the transverse peculiar velocity of the 
lens. 

iKimdic fc Wambsgansi (11993ft JSchramm etafl (|1993ft . 
and lWambsganss &; Kundia ()1995ft considered the effects 
of random stellar motions and found that these mo- 
tions can also lead to shorter microlensing time scales 
because the pattern velocities of the microlensing caus- 
tics can be much higher than any physical velocities. 
As a result, measurements based on static magnifica- 
tion patterns may underestimate source sizes or mean 
masses or overestimate the transverse velocity in order 
to ma tch the effects create d by the random stellar mo- 
tions. W vithe et al.l (|200 0a'l showed that it is possible to 
statistically correct for these effects and that the veloc- 
ity correction can be up to 40% depending on the optical 
depth and shear. Another benefit to dynamic magnifi- 
cation patterns is the ability to properly account for the 
velocity of E arth around the Sun and the rotation of the 
lens galaxy ()Tuntsov et al.ll2004ft . 

Dynamic magnification patterns are also important be- 
cause they impart a well-measured physical scale to the 
patterns. All direct microlensing observations are in 

1/2 

"Einstein units" where the length scale is (M) cm. 
Determining masses, velocities, or source sizes requires 
some sort of dime nsional prio r . In our studies we have 
generally followed IKochanek (2004) and used velocity 
priors designed to mimic the combined effects of ran- 
dom and ordered motion. The reason for focusing on 
the velocity is that we know two of the contributions, 
our velocity and the stellar velocities, and the remaining 
peculiar velocities of the lens and source are truly ran- 
dom variables for which we have reasonable priors from 
cosmological models. Source sizes turn out to be lit- 
tle affected by t he choice of priors (see the discussion in 
Kochanek 2004) , but estimates of the true velocity and 
the mean stellar masses are affected. Hopefully by in- 
cluding the true random stellar motions we can further 
reduce the sensitivity of microlensing results to such pri- 
ors. 

In this paper we use microlensing to measure the pe- 
culiar velocity of a lens galaxy and the mean mass of its 
stars including the effects of the stellar motions, Earth's 
motion, and the rotation of the lens galaxy. The trans- 
verse velocity direction can be measured with microlens- 
ing because the shear sets a preferred direction for each 
image and the statistics of variability depend on the mo- 
tion relative to this axis (see Figure [T|). In theory, accu- 
rately measuring the transverse peculiar velocity of many 
galaxies over a broad range of rcdshifts could form the ba- 



sis of a new cosmological test (|GouldlU995ft . Measuring 
the mean stellar mass, including remnants, is a n inde- 
pend ent means of checking local accountings (e.g. I Gould! 
2000), which must be assembled from very disparate se- 
lection methods for high mass, low mass, evolved and 
dead, remnant stars. Moreover, doing this is possible 
in detail only for the Galaxy. While microlen sing is rel- 
ative l y insensitive to the mass function (see iPaczvnskl 
Il986t IWvithe et al.ll2000bl ). there are some prospects of 
exploring this in the future as well (e.g. | Wvithe fe; Turnerl 
[200lt [Schechter et alJl2"5ol ICongdon et al.ll200'7ft~ 

This work ex pand s on the meth o ds de scribed in 
IKochanek! (|2004ft and IKochanek et all (|2007l ) by adding 
the random stellar motions in the lens galaxy. In this 
paper we address the computational issues and then ap- 
ply this improved technique to determine the transverse 
motion of Q2337 and the mean mass of its stars. In 
Paper II, we study the shape of the accretion disk of 
the source quasar. We describe the photometric data 
in Sj2] Then we describe the Bayesian Monte Carlo 
Method and the models we use in S}3] Our results are 
presented in <2] followed by a discussion in <J51 We use 
an Om = 0.3, ft a = 0.7 flat cosmological model with 
H = 72 km s" 1 Mpc" 1 . 

2. DATA 

The quadr uply lensed z s = 1.6 95 quasar Q2237 was 
discovered by iHuchra e t al. (1985]). The images are ob- 
served through the bulge of a barred Sab lens galaxy 
at a projected distance ~ C/9 (~ 700 pc). The very 
low z\ — 0.0394 lens redshift leads to very fast lens mo- 
tions projected onto the source plane, leading to vari- 
ability timescales as short as ~ 0.2 years (Equations 
[2] and El). M i crolen sing of Q2237 was first observed 
by llrwin et al.l (|1989ft and confirmed by iCorrigan et al.l 
(1991). There ar e also detailed mass models and dy- 
namic a l studies by lSchneider et al.l (11988ft. Kent fc Falccj 
(1988). lRjxetall (11992ft .iMihoyl (12001ft . iTrott fc Webster! 
(2002). and Ivan de Yen et all (|2008ft 

We analyze nearly 11 years of the Optical Gravita- 
tional Lensing Exper iment V-band photo metric monitor- 
ing data for Q2237 (|Udalski et al J 12006(1 . To speed our 
analysis and as a cross check on the results, we divided 
the OGLE data into two separate light curves. The first 
light curve is from JD 2,450,663 to JD 2,452,621 and con- 
sists of 100 epochs and will be referred to as LCI. The 
second light curve has 230 epochs from JD 2,452,763 to 
JD 2,454,602 and will be referred to as LC2. Each light 
curve covers just over 5 years. The light curves are shown 
in Figure [2] 

Since Q2237 is expect ed to have a very small time dela y 
between its images (e.g.lWambsg anss fc Paczvnskill994T ). 
we only need to subtract the light curves (in magnitudes) 
to remove the intrinsic variability of the quasar. We es- 
timated the systematic photometric errors in the OGLE 
data using each successive triplet of epochs spanning less 
than 15 days. We used the first and last point of each 
triplet to predict the middle observation and then derived 
the systematic error that, when added in quadrature to 
the OGLE uncertainties, make the predictions consistent 
with the uncertainties. These systematic error estimates 
are 0.02, 0.03, 0.04, and 0.05 magnitudes for images A, 
B, C, and D respectively. 
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Fig. 1. — Example of a trial source trajectory (dark line segments) superposed on an instantaneous point-source magnification pattern for 
(M/Mq) = 0.3. Darker shades indicate higher magnification. An HST H-band image in the center labels the images and the corresponding 
magnification patterns. Each pattern is rotated to have the correct orientation relative to the lens. This particular LC2 trial has an effective 
lens-plane velocity of ~ 600 km s -1 Northeast. The solid disk at right has a radius of 10 17 cm on these patterns. 



3. METHODS 

Our Bayesian Monte Carlo method (|Kochanekl 12004) 
requires the construction of magnification patterns and 
a model of the quasar accretion disk. The patterns are 
convolved with the source model and used to produce 
large numbers of simulated light curves for comparison 
to the data. The results for these trial light curves are 
combined in a Bayesian analysis to measure parameters 
and their uncertainties. Here we describe the genera- 
tion of the patterns ( J3.1|) . the source model ( §3.2j) . the 
Bayesian Monte Carlo method and its priors ( §3.3|) . and 
the computational techniques needed to allow for stellar 
motion (§33|. 

3.1. Magnification Patterns 

We generate dynamic magnification patterns (see Fig- 
ure [1] for examples of instantaneous patterns) in the 
source plane by randomly placing stars near each macro 



TABLE 1 
Lens galaxy model parameters. 



Image 


K 


7 


PA [deg] 


A 


0.40 


0.40 


175 


B 


0.38 


0.39 


-39 


C 


0.73 


0.72 


70 


D 


0.62 


0.62 


-63 



Note. — The normalized surface density (k), shear (7) and its 
position angle at the location at each image. 



image in the lens galaxy. The normalized surface density 
and shear are determined by fitting models to the HST 
astrometry of the four images rela tive to the lens g alaxy 
and the mid-IR image flux ratios (|Agol et al.l f2000). We 
modeled the lens as a power law mass profile with the 
lens model program of the gravlens package (|Keetonl 
|2001[ ). Because all the images are ~ 700 pc in projected 
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Fig. 2.— The OGLE Q2237 V-band light curves. The left panel 
is LCI and the right is LC2. The rows from top to bottom are 
images A, B, C, and D. Here we show the corrected error bars. 
The red curves are one of our best fit models for the microlcnsing 
and intrinsic source variation. Because we only determine the light 
curve at the epochs with data, gaps are filled by linear interpola- 
tion. 

distance from the galactic center, we expect the surface 
density to be dominated by the stars rather than by dark 
matter = 1). This ass umption is c orrob orated by 

the microlensing an alysis of IKochanekl (120041) and the 
dynamical models of Ivan de Ven et al.l (|2008i ). The nor- 
malized surface density, k, and tidal shear, 7, from this 
model (see Table [J) were used to generate the magnifica- 
tion patterns. The lens plane is populated using a mass 
function of dN/dM oc M -13 with a dynamic range of 
M max /M m j n = 5 based on the Galactic mass function 
of IGouldl (|2000D . We use patterns with mean ma.sst 's 
of (M) = 0.01, 03, 0. 1, 0.3, 1, 3.0, and 10 M & . We 
use the IKochanekl (|2004l ) particle-particle/particle-mesh 
implementation of the inverse ray-shooting method (e.g. 
iKavser et ail I1986D to create N^ ix = 4096 2 magnifica- 
tion patterns. The shear is slightly adjusted (by at most 
l/Npix) in order to produce periodic magnification pat- 
terns that eli minate edge e ffects, as detailed in the Ap- 
pendix of IKochanekl (|2004f) . The outer scale of the pat- 
terns is 20i? E = 3.7x 10 18 (M/M Q ) 1/2 cm, which results 

in a resolution of 0.005 i? E /pixel = 9.0 x 10 14 (M /M w ) 1/2 
cm/pixel. For comparison, Mor gan et al.l(|2010f) estimate 
that the black hole mass corresponds to a gravitational 
radius of r g = GM-qu/c 2 = 2 x 10 14 cm and in Paper 
II we find a disk scale length of 6 x 10 15 cm (half-light 
radius of 1.5 x 10 16 cm. 

For the first time in any model of microlensing data, 
we fully include the random motions of the microlenses 
by using an animated sequence of magnification patterns. 
We use the measured one-dimensional velocity dispersion 
of 1 70 km s" 1 (van de V e n 2009, personal co mmunication, 
also iTrott et al.l (|2008l ); iFoltz et al.l (j 19921 ) ). We assign 



each star a random velocity as a Gaussian random de- 
viate of amplitude 0% = 170kms _1 for each coordinate 
and then generate a magnification pattern for each im- 
age/epoch combination. While binary stars make up a 
large fraction of stellar systems (e.g. iFischer fc Marcvl 
1992), they should not have a significant effect on the 
patterns. Only relatively close binaries (<C 1 AU) have 
significant orbital velocities compared to the stellar or 
bulk motions, and such separations are very small com- 
pared to the Einstein radius (1100 AU for IMq in the 
lens plane). Thus, binary motion is only significant for 
close binaries, but close binaries have separations much 
smaller than the Einstein radius or our estimated source 
sizes (66 AU for IMq in the lens plane, see Paper II) 
and would be indistinguishable from a single point mass 
on our patterns. In effect, binaries should only act like a 
shift in the mean mass of the stars. 

3.2. Disk Model 

We employ a generic thin disk model for which the 
sur face temperature scales a s T s oc i?~ 3 / 4 with radius 
R (|Shakura k. Sunvaevl 119731 ). The microlensing signal 
is primarily sensitive to the half-light radius of the disk 
(|Mortonson. Schechter fc Wambs ga.nss 2005), so the de- 
tails of the rad ial profile have limited effects. As in 
IKochanekl (l200l . we neglect the inner disk edge, since it 
should have few observed effects given disk sizes at these 
wavelengths. We define the area of the disk to be the area 
enclosed within the contour defined by kT = hc/X. We 
first parametrize the source models by choosing from 24 
different projected areas covering log 10 (area/cm 2 ) = 29.2 
to 33.8 in steps of 0.2. For each source area, we used five 
inclinations, i, with a cosi of 0.2, 0.4, 0.6, 0.8, and 1.0 
(face-on), and for each area and inclination, we used 18 
different equally spaced position angles for the major axis 
of the disk. Paper II discusses the disk model in detail. 

3.3. The Bayesian Monte Carlo Method 

We use the Bayesian Monte Carlo Method of lKochanekl 
(2004). We randomly generate light curves from the 
animated microlensing magnification patterns over the 
full range of physical parameters and source sizes and fit 
them to the observed light curves. We then use Bayes 
Theorem to infer the likelihood distribution of the pa- 
rameters given the fit statistics for the light curves. Each 
simulated light curve is defined by 

mi(t) = S(t) + m + 5pLi{t) + = S(t) + m ttot , (4) 

where S(t) is the intrinsic variability of the source, /i.; is 
the macro model magnification, 6fii(t) is the microlens- 
ing, and A/ii is the systematic magnification offset for 
each image, i. For each trial we compute the goodness 
of fit 

'm»(t) - S(t) - fn )tat 



x 2 = 



EE 



o-i(t) 



(5) 



after solving for the optimal model of the source variabil- 
ity S(t) and the magnification offsets. The parameters 
we vary in this study include the projected area of the 
disk, the inclination of the disk, the position angle of the 
disk, the effective velocity of the source, and the mean 
mass of the stars in the lens galaxy. We call these the 
physical parameters, £ p . For any combination of these 
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parameters, we also randomly select starting points on 
each of the magnification patterns and refer to these nui- 
sance parameters, £ n . 

We calculate the likelihood of the data given the pa- 
rameters as 



P(LC|£ p ,£„)ar 



N, 



dor 



X_ 
2 



(6) 



where V is the incomplete Gamma function. Ko chanekl 
(2004) justifies this form by allowing for uncertainties in 
the magnitude errors, (Ji{t) and averaging over these un- 
certainties. This ensures that the likelihood is consistent 
for trials fitting better than x 2 /dof = 1 given the formal 
uncertainties. 

The probability of the parameters given the data is 
then 

P(£ p ,£ n \LC) ex P(£C|£ p ,£ n )P(£ p )P(£ n ), (7) 

where P(£ p ) and P(£ n ) = 1 ar e the prior probability 
distributions of the physical and nuisance parameters. 
Since we are analyzing two separate parts of the same 
light curves, we combine the results to improve our mea- 
surements by multiplying the probabilities for each light 
curve and then applying the priors, 

P(£ P \LC1, LC2) cx P(LCl\t P )P(LC2\t p )P(Z P ). (8) 

We did this for two reasons. First, it becomes (proba- 
bly exponentially) harder to find good fits to longer and 
longer light curves. Second, analyzing the curves sepa- 
rately allows us to study whether different light curves for 
the same object give the same answers. The price is that 
analyzing them separately and then combining them will 
have less statistical power than a simultaneous analysis 
of all the data. We compute the probability distributions 
by marginalizing over the nuisance variables 



P(£ P |LC)cx J P(£ p ,( n \LC)d4 a . 



(9) 



We compute this as a Monte Carlo integration over the 
trial light curves, which should converge to the true in- 
tegral if we generate enough simulated light curves. 

For each source size, inclination, and disk position an- 
gle we must first convolve the magnification pattern with 
the source model. Then we produce trial light curves by 
choosing random starting points and velocities across the 
animated sequence of magnification patterns. In addition 
to the random motions of the stars, we must also assign 
bulk velocities to the observer, v Q , lens, vj, and source, 
v s , leading to an effective (source-plane) velocity of 
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(e.g. iKavser et al.l I1986D that is dominated by the lens 
velocity, vj, in the case of Q2237. From t he projec- 
tion of the CMB dipole (jHinshaw et al.ll2009D . we know 
that v = (—50,— 23) km s" 1 East and North respec- 
tively for Q2337. Based on iTinker. Wetzel, fc Zehavil 
(2009) we estimate that the (ID) rms peculiar veloci- 
ties of the lens and source are <7i ens = 327 km s -1 and 
<r src = 230 km s" 1 , respectively. For the calculations, we 
randomly draw each effective velocity coordinate (in the 



lens plane) from a one dimensional Gaussian distribu- 
tion with a = 1000 km s -1 and then re- weight the trials 
to a more physical range when we carry out the Bayesian 
integrals. 

As discussed in the introduction, quasar microlensing 
is subject to a degeneracy between mean stellar mass, 
effective velocity, and accretion disk size. Including ran- 
dom stellar motions partially breaks this degeneracy by 
introducing a physical scale to the magnification pat- 
terns. We still need a prior on one o f these par a meter s 
to make useful measurements. As in iKochanekl (|2004|) . 
we apply a velocity prior. Here we define our prior in the 
lens-plane, since the lens motion dominates the effective 
velocity. In the absence of any "streaming velocities" , we 
can determine the peculiar velocities only up to a 180° 
degeneracy that corresponds to a time reversal symme- 
try given that the peculiar velocity priors depend on the 
speed but not the direction of motion. Our velocity prior 
in the lens-plane is 



P(vi) cx exp — 



(vi - V C MB,l)' 
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where vcmb.i is our CMB motion projected onto the lens- 
plane, and the expected dispersion in the lens-plane is 
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The very high projected motion of the lens due to the 
very low lens redshift means that the source motion is 
unimportant even though cr src ~ u\ cns - "Streaming ve- 
locities", such as our motion relative to the CMB, our 
orbit around the Earth (parallax effect), or rotational 
velocities in the lens break this degeneracy. These mo- 
tions (up to ~ 10% of the peculiar velocties), do slightly 
break this degeneracy. 

With the stars moving it is also makes sense to include 
the effects of the Earth's motion and the small rotation 
velocity of the lens galaxy as part of the motions across 
the animated patterns. Aside from lTuntsov et all (2004) 
the motion of the Earth has not previously been included 
in a quasar microlensing calculation. Earth's motion pro- 
jected onto the lens plane is approximately 10% of the 
expected transverse velocity of the lens motion (the dom- 
inant motion of the system). It is also ~ 20% of the 
minimum possible velocity scale set by the random stel- 
lar motions. In trials with (M/Mq) = 0.3, we found that 
including parallax increased the total probability of all 
trials by ~ 20%, and reversing the Earth's motion re- 
duced the probability by a similar amount. Earth's orbit 
is trivial to include and computationally inexpensive, so 
we include it in our standard calculations even though it 
has modest effects on the likelihood. 

The lens is an late type galaxy with rotation in the 
plane of the sky of ~ SSkms^ 1 for images A and B , 
and ~ 20kms _:t for images C and D <|Trott et al.ll2008ft . 
The position angle of these rotation velocities are 84.5°, 
— 129°, —20°, and 153° (north through east) respectively 
for images A, B, C, and D. These are relatively low com- 
pared to the disk because the images lie in the bulge. The 
velocities for images A and B are greater than the mod- 
ulations introduced by the Earth's orbit, so we include 
them in the simulation. 
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3.4. Computational Techniques 

The Monte Carlo method requires simultaneous ran- 
dom access to every magnification pattern for each im- 
age and epoch. With the stars moving, this means we 
need 400 and 920 patterns for LCI and LC2 respectively, 
corresponding to 25 and 57.5 gigabytes of storage for 
4096 2 patterns instead of the 4 patterns and 0.25 giga- 
bytes needed for stationary stars. This is more memory 
than is generally available on any one machine. 

Our first step towards solving this computational prob- 
lem is to conserve memory by compressing the "gray" 
scale of the convolved patterns. Normally we store pat- 
terns as a 4096 2 array of 4-byte floating point num- 
bers. However, magnification patterns typically span a 
dynamic range of 10 magnitudes magnitudes even for the 
smallest source sizes, and the data uncertainties are no 
smaller than 0.01 magnitudes. Thus we only need a log- 
arithmic dynamic range of 10/0.01 = 1000 rather than 
the 2 32 dynamic range of a floating point variable. For 
example, if we use 16 bits for each pixel, which is more 
than sufficient given the dynamic range of the data, we 
can pack 4 pixels into one 64 bit word rather than using 
128 bits, which not only compresses the data by a factor 
of 2 but also has advantages for data transfer speeds. In 
practice, we adjust the compression level for each mag- 
nification pattern and source size to have a resolution at 
least ten times better than the uncertainty in the corre- 
sponding data point. We achieve compression ratios of 
2.5 to 3 for the OGLE data. 

Even after compression, the full collection of magnifi- 
cation patterns is still too large for most single machines, 
so we distribute them evenly among parallel comput- 
ers. This has the added benefit of utilizing additional 
CPUs, but at the cost of needing to communicate be- 
tween nodes. Our goal is to minimize the need for this 
communication. We sort each light curve in chronologi- 
cal order and then distribute the epochs in a round robin 
fashion to each node, so that each node has a sparse but 
complete representation of the data. Trial light curves 
are started on the individual nodes. If a trial's x 2 ex- 
ceeds a threshold it is simply discarded. If it is under the 
threshold, it is passed to other nodes to be tested against 
the rest of the light curve. This basically amounts to a 
low resolution pre-search for good-fitting light curves be- 
fore doing any expensive communication with the other 
nodes. These light curves are optimized by exploring 
slightly different starting points and velocities across the 
magnification patterns. This requires the master node 
for each trial to do many communications with the other 
nodes to compute the full x 2 > but our tests show that 
this finds good fits faster than trying more light curves. 

For each source model, we choose 10 5 starting points 
and velocities for one of the four images. Then we search 
for pairwise matches by trying 10 4 starting points on 
each of the other three images and keep the best match 
for each image. A light curve is then produced from this 
velocity and starting point for each image. The \ 2 f° r 
each trial light curve is computed from the data. If the 
X 2 of a trial exceeds a threshold during its calculation, 
we discard it immediately. Such poor solutions will make 
no contribution to the Bayesian integrals (Equation |9j, 
so there is no point in wasting further calculation or com- 
munication on completing the trial. 



LCI was processed in 1.6 CPU-years and LC2 was pro- 
cessed in 2.8 CPU-years utilizing 16 AMD Opteron ma- 
chines (64 processor cores) simultaneously at the Ohio 
Supercomputing Center. In total we tried 9 x 10 14 unique 
starting points and 3 x 10 9 different velocities. We found 
significantly fewer good fitting trials for LCI, so our 
threshold for saving trials was x 2 /dof < 4 for LCI and 
X 2 /dof < 2.5 for LC2. Our best fit simulated light curves 
have x 2 /dof = 0.86 for LCI, and x 2 /dof = 0.99 for LC2. 
There was a large event in image C, and more rapid mag- 
nification changes in LCI as compared to LC2 (see Figure 
[2]), and this likely explains why it was harder to find good 
fits for LCI. With these cuts, 3 x 10 6 and 6 x 10 6 trials fits 
passed the cuts for LCI and LC2 and were saved. Even 
though the best fitting light curves produced x 2 /dof ~ 1, 
we rescaled the x 2 f° r each light curve to produce better- 
defined results in the Monte Carlo integral (Equation [9]) 
for each parameter. We divided the x 2 °f trials of LCI 
and LC2 by 1.9 and 1.4 respectively for our final analysis, 
so that 10 trials for each set were less than x 2 /dof after 
rescaling. In general, this is conservative and broadens 
the parameter uncertainties by y/1.9 and y/lA over what 
we should achieve with an infinite number of trials. 

4. RESULTS 

Here we estimate the transverse peculiar velocity of the 
lens galaxy, the mean mass of its stars, and the mean 
magnification offsets defined in Equation 2) We quote 
the results from the combined analysis of LCI and LC2, 
but also show the results from the independent analyzes 
of LCI and LC2 in Figures |3] gj and M Our main 
results are based on the a — 327 kms -1 lens-plane (ID) 
peculiar velocity prior described in £13.31 and assume that 
all the light comes directly from the accretion disk. We 
verify the latter assumption and examine the structure 
and orientation of the accretion disk in Paper II. 

4.1. Transverse velocity 

Figure [3] shows the likelihood contours for the trans- 
verse velocity of the lens galaxy. The peak likelihood is 
~ 300kms _1 East. The individual light curve results 
are very consistent with the joint analysis, as shown by 
superposing the 68% contours for LCI and LC2 in Fig- 
ure [3J This agreement can also be seen in the position 
angle of the lens motion (Figure 0| and in the lens speed 
(Figure [5]) . 

After integrating over direction we find a trans- 
verse speed of 438^13 kms" 1 (438^5 km s" 1 ) at 68% 
(95%) confidence (Figure O including our standard prior 
(Equations [TT1 and [T2"|) . The inclusion of a physical model 
of the stellar motions does not completely eliminate de- 
generacies, and our speed estimate is dominated by our 
velocity prior (Equation lll|) at large speeds. If we instead 
use the broader lens-plane prior with a = 1000 kms -1 
from which we derive our trials, the measured speed be- 
comes 1048l4gg kms -1 at 68% confidence basically fol- 
lowing the prior. Therefore, we only have measured a 
lower limit v t > 338 kms -1 . Fixing the mean microlens- 
ing mass has little effect, provided the mass is sufficiently 
large. If we fix the mean mass to be (M) = 0.3M Q we 
find that v t < 486 (757) kms" 1 at 68% (95%) confidence. 

These resu lts are consistent with earlier results. 

iWvithe et all (fl999h found a 95% upper limit on the 
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Fig. 3. — Probability distribution for the effective lens-plane 
velocity. The dark solid contours enclose 25%, 68%, and 95% of 
the likelihood relative to the peak. Because the motion is strongly 
dominated by the lens, this is essentially the transverse peculiar 
velocity of the lens gal axy. The green dashed contours are the 
velocity prior (Equation [TT} , drawn at the same levels. The small, 
red, dotted circle is the 68% contour for the contribution to the 
prior from the expected motion of the source. The black point is 
our CMB motion projected onto the lens-plane. The 68% enclosed 
probability for the LCI and LC2 analyses are shown as dotted blue 
and dashed red contours. 
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Fig. 4. — Position angle of the effective velocity in the lens-plane. 
The dotted blue (dashed red) curve shows the results from only the 
LCI (LC2) analysis. 
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Fig. 5. — Speed distribution of the effective velocity of the source 
in the lens-plane (dark solid curve). This motion is dominated by 
the priors on the transverse peculiar velocit y of the lens. The 
green dashed curve shows the prior (Equation 1116 integrated over 
directio n. The upper limits from other studies are indicated with 
arrows I Wyithe et alj[l99St lUiLMerino et aT|[20(55T) . The dotted 
blue (dashed red) curve shows the results from only the LCI (LC2) 
analysis. 

transverse lens speed, v t < 500 km s" 1 by using deriva- 
tives of the microlensing light curve from patterns pro- 
duced from a Salpeter mass function including stellar 
motions and a range of source sizes. They tried three 
different mass models with (M/Mq) = 0.22,0.31 and 
1.0 and found that the estimated transverse speed scaled 
with yj (M) . Using the distribution of gaps between 
high magnification events, bu t not including stellar mo- 
tions, [GiLMirinoiEaD (120051 ) found a 90% upper limit, 
tk < 630(2160) kms" 1 for M = 0.1(1.0)M Q lenses. It 
must be noted that neither study included the full range 
of physical uncertainties we include here. 

4.2. Mass 

We measure the mean stellar mass in the lens to be 
0.12 < (M/Mq) < 1.94 (0.04 < (M/Mq) < 3.46) at 
68% (95%) confidence with a median of (M/Mq) = 0.52. 
This is generally consistent with earlier estimates for this 
lens based on less data and simpler analyses including 
fewer of the physical uncertainties. It is margi n ally c on- 
sistent with the earlier estimate by IKochanekl ()2004D of 

{M/Mq) = 0.018±g;gfJ (0.018±8;j£?) at a 68% (95%) 
confidence using a similar velocity prior but with less 
data and static stars. Our uncertainti e s are a fac- 
tor ~ 2 times smaller. iLewis fc Irwinl (|1996l) argue 
for a mean mass of 0.1 < M/Mq < 10 by com- 
paring the observed magnification probability distribu- 
tion to that of sim ulations that did not include ran- 
dom stellar motions. iWvithe et al.l (|2000bD estimate that 
(M/Mq) = 0.29 with a lower limit of (M/Mq) > 0.11 
at 99% confidence by analyzing the distribution of light 
curve derivatives and assuming a ID stellar dispersion of 
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Fig. 6. — Mean mass of the stars. The dotted blue (dashed red) 
curve shows the results from only the LCI (LC2) analysis. 




001 I / 

200 400 600 800 1000 



Lens Speed [km/s] 

Fig. 7. — Mean mass, (M) of the stars versus the speed of the 
lens, v. The dashed line isMati 2 . 

165 km s -1 combined with a prior on the transverse ve- 
locity. iGil-Merino fc Lewisl (|2005l ) simply argued that 
the masses must be grea ter than Jupiter -like objects, 
contrary to the claims of iLee et al.l (|2005l ). In our re- 
sults, the mass estimate is still strongly affected by 
our velocity prior (Equation [TT]). If we use the broad 
a = 1000 km s -1 lens-plane velocity prior, the median 
rises to (M/M©) = 1.5, and we would need to expand 
the calculations to higher masses to fully sample the mass 



Mass (M Q ) 

Fig. 8. — The most probable (M) = 0.3Mq microlensing mass 
function used in the cal culations ( dotte d c urve) as compared to 
that predicted from the ICh abricr ( 20031) or IKroupa et al.l l|1993l ) 
initial mass functions truncated at masses lower than O.OIMq 
(mean mass 0.20AfQ) and O.O8M0 (mean mass 0.32Mq), respec- 
tively, for an age of 10 Gyr. The features due to the main sequence 
(MS), white dwar fs (WD), neutron star s (NS) and black hole s (BH) 
are marked. The Chabricr (2003) and Kroupa ct al. (1993) mass 
functions are slightly offset to make the different amplitudes for 
neutron stars and black holes visible. The assignment of a 5Mq 
mass for black holes is arbitrary but not important. 

distribution. The problem for accurate mass measure- 
ments is that (M) ex. v 2 (see Figure [7|), so the mean mass 
is very sensitive to the speed distribution (see Figure [7j) . 
Including the mean stellar motions eliminates low veloc- 
ity solutions corresponding to low masses, but cannot 
eliminate high mass, high velocity solutions. 

We can compare this mass estimate to that ex- 
pected from stellar mass functions. We approximate 
the lifetimes of stars by the time to reach the base of 
the giant branch, u sing the approximate expression in 
IHurlev et al.l (l2000l) . The remaining lifetimes beyond 
this point are an unimportant correction. Stars older 
than this lifetime are modeled as remnants using the 
white dwarf initial/final mass re lation A%d = 0.109M+ 
0.394M Q of lKalirai etall pOOl for M NS < 8M Q , a neu- 
tron star mass of 1.35M Q for 8M Q < M < 40M Q , and 
a black hole mass of 5M for masse s Mrh > 40Mq. 
Using the initial mass function from iChabrieil ([2003), 
a combination of a log-normal distribution at low mass 
and a power-law at high mass covering 0.01M Q < M < 
lOOM©, the mean mass is 

(M) ~ (0.20 + 0.03 log (t/W Gyr))M (13) 

for any reasonable population age t. Age has little effect 
because the high mass stars which evolve on these time 
scales make a limited contribution to the mean mass, and 
the mass scale beyond which stars have evolved depends 
weakly on age JM evolve ~ (f/11 Gyr) a31 M© for the 
IHurlev et a l. (2000) models). Chan ges in the white dwarf 
mass relations also have little effect. For Mwd = aM+b, 
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the sensitivity is S (M) ~ O.O8<5oM + 0.05<5&, where 
iKalirai etT al. ( 2008) estimate uncertainties of 8a ~ 0.007 
and Sb ~ 0.025A/©. Similarly, changes in the masses 
of neutron stars and black holes have negligible effects 
on the mean mass, with 6(M) ~ 0.0035AA/ns and 
~ 0.00034AMbh, respectively, and the same holds true 
for the masses defining the boundaries between rem- 
nant types. Even giving all stars a binary companion 
with the secondary mass ratio uniformly distributed from 
1/50 < M2/M1 < 1 affects the mean mass little, roughly 
0.05M Q . The sense of the effect depends on the size 
distribution of the binaries. Very wide binaries act like 
independent stars and so lower the mean, while very close 
binaries act like a single, higher mass star and hence raise 
the (effective) mean. 

Thus, only changes in the actual initial mass func- 
tion can s i gnific antly alter the expected mean mass. The 
Chabricr (2003]) mass function converges to low masses, 
so extending the mass range downwards to O.OOlAf© from 
O.OlAf© reduces the mean mass by only 0.02M Q . Signif- 
icant changes require a mass function converging more 
slowly at lower masses or adding entirel y new popula- 
tions. For example, if we instead use a iKroupa et al.l 
(1993) mass function, the results are very sensitive to 
the low mass cutoff because the mass function is a ris- 
ing power law (oc Af~ 13 ) to low masses. For mini- 
mum masses of 0.08M Q and O.OlAf© the mean masses 
are 0.32Af Q and 0.15M© respectively. Figure [8] com- 
pares these mass functions to our model simple power 
law model with (M) = 0.3M Q and Af max /Af min = 50, 
to show that our maximum likelihood model is in good 
agreement with expectations for normal stellar popula- 
tions. 

We can also compare this mean mass to microlens- 
ing measurements made in our own Galaxy and in other 
quasar microlensing studies of other lenses. The MAssive 
Compact Halo Object (MACHO) survey measured the 
most likely mass range of compact objects in the Milky 
Way Halo t o be 0.15 < M/My < 0.9 depending on the 
halo model lAlcock et all (|2000( ). although these results 
are broadly questioned (e .g. see iTisserand et al.1 120071 : 
iWvrzvkowski et all 120091 ). Estimates for the Galac- 
tic bulg e are probably more relevant for comparison to 
Q2237. lHan fc Gould (|1996l) determined that a power 
law mass function dN/dM cx M~ 21 for M > 0.04Af Q 
was the best fit to a sample of 51 MACHO Galac- 
tic bulge microlensing events. This corresponds to a 
mean mass of 0.19M(p assum ing a maximum mass of 
10Af s . iGrenacher et al.1 (|1999f l studied the first 41 MA- 
CHO bulge events toward Baade's windows and found 
a mean mass of 0.09M© (0.129) for bulge (disk) lenses. 
They assumed a Salpeter mass function in the range 1- 
10 Mq and fi t for the best slope and m inimum mass 
below 1M Q . iCalchi Novati et al.1 (|2008ft found a very 
similar result. Outside our Galaxy, the only limits aside 
from those for Q2237 are those for the doubly i mage d 
quasar Q 0957+561 bv iSchmidt fc Wambsganssl (| 19981 ). 
who found a weak lower limit of (M) > O.OOlAf©. 

4.3. Magnification Offsets 

We can also try to measure the relative mean mag- 
nification offsets between each of the images. In our 
models we do not constrain the mean magnification 
ratios of the images to closely match the predictions 
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Fig. 9. — Differential magnification offsets , Afij for A— B, A — C, 
and A— D. The extinction measurements by Agol et al. (2009) are 
shown relative to image A. The dashed curve is the prior we used 
on the magnification offsets. 
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nification, and contamination of the light curves by light 
from the lens or host galaxy can also change the relative 
brightnesses of the images. We allow A/Zj (Equation 01 
to be optimized for each fit, subject to a Gaussian prior 
with a 1.0 magnitude dispersion. In Figure |H] we show 
the posterior probability distributions for these differ- 
ential offsets. For an infinitely long light curve, these 
offsets will converge to zero in the absence of any sys- 
tematic problems. The differential offsets between A— B 
and A— D show weak evidence for offsets, but there is 
surprisin gly little converge nce in their values. For com- 
parison, lAgol et ail (|2009t ) used the flux ratios of the 
quasar broad lin e s as co mpared to the continuum from 
lEigenbrod et al.l (|2008aD to estimate the extinction of 
the images relative to A. They found AE(B — V) = 
0.02 ±0.05, 0.10 ±0.04, and 0.18 ±0.03 for images B, C, 
and D respectively. Figure [9] shows these estimates as- 
suming a Ry = 3.1 extinction curve. There is some cor- 
relation between our estimates and these shifts, but our 
estimates are simply too uncertain to draw any conclu- 
sions. We experimented wit h forcing our trial s to match 
the extinction estimates of lAgol et al.l i|2l)l)9i by multi- 
plying the probability of each trial by a Gaussian model 
of these extinction estimates. We found no significant 
influen ce on any other parameter distribution. iDai et aLl 
(2009) reached a similar conclusion in their analysis for 
RXJ1131-1231. 

5. DISCUSSION 

By including the random motions of the stars, we can 
now use microlensing to study the peculiar velocity of the 
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lens galaxy and to estimate mean stellar masses and po- 
tentially the stellar mass functions with fewer systematic 
uncertainties. In particular, we find a clear preference 
for the direction of motion of the lens galaxy. In fact, 
as we use a less restrictive velocity prior, the direction 
of motion is better constrained since faster speeds are 
allowed. We cannot however, determine the speed with- 
out additional priors. If we assume a mean stellar mass 
of (M) = 0.3M©, we find that the peculiar velocity is 
v t < 48 6 km s -1 which is c o nsistent with the other esti- 
mates (|Wvithe et all 119991: iGil-Merino et all I2005D but 
more fully includes all the physical uncertainties. It 
should not be surprising that we can determine a dimen- 
sionless quantity, the direction of motion, better than the 
dimensional speed, given the basic problem of microlens- 
ing that all observables are in y{M) cm. Very roughly 
(see Figure[T|), the preferred direction has images A and B 
moving more closely to perpendicular to the ridges of the 
magnification patterns created by the shear and images 
C and D are moving more parallel to the shear direction. 
This is consistent with the variability of A/B compared 
to C/D. We included the parallax effects of the Earth's 
orbit, and the results weakly favored its inclusion. 

We had hoped that modeling the random motions 
would be more of a help in breaking these degeneracies 
by setting a physical scale. This is probably true for 
low mean masses (M). For fixed variability amplitudes, 
reproducing the light curves with a low mean mass re- 
quires small physical velocities, while high mean masses 
require high velocities. Adding the stellar motions at 
their observed dispersion eliminates low mass solutions 
independent of the unknown peculiar velocities by set- 
ting a floor to the velocity scale. High mass solutions 
need peculiar velocities, a, that are larger than the stel- 
lar motions, <7*, and so are only constrained by the pri- 
ors on the peculiar velocities. Essentially, the dynamic 
patterns act like static patterns once cr* < a, and we re- 
cover the familiar degeneracies of static patterns. Thus, 
our correct treatment of the stellar motions constrains 
low mass but not high mass solutions in the absence of a 
peculiar ve locity prior. With a well-defined cosmological 
prior on a ([Tinker. Wetzel, fc Zehavi20"09l) , wc find that 
0.12 < {M/Mq) < 1.94 at 68% confidence, demonstrat- 
ing that the microlensing objects are typical of stellar 
populations and their remnants. This mass range is con- 
sistent with expectations for normal stellar populations 
(see £|4.2p . but not tightly constraining. 

We largely ignore the macro magnifications predicted 
by the mass distribution of the lens galaxy in our calcula- 
tions because of their systematic uncertainties. However 
some recent studies have made use of this information by 
analyzing image pairs straddling a cr itical curve which 
should have the same magnification ()Flovd et al.l 120091 : 
Bat e et al.l 12008). A concern is that the macro mag- 
nification may be affected by undetected substructure, 
differential extinction, or contamination by the lens or 
host galaxy. In our standard analysis we use the AC 
signal and largely discard the DC signal by not tightly 
constraining the mean magnification. Given sufficiently 
long light curves, the results will converge to the true 
magnification offsets. Even for Q2237, with its decade 
long OGLE light curves, the data are not sampling long 
enough paths across the patterns (see Figure Q} to show 



convergence. At present, the distribution of differential 
mean magnification offsets are too broad (Figure [9]) to 
tightly constrain any systematic magnification offsets. 
Fortunately, our results for the other physical parame- 
ters are little affected by whether we allow these offsets 
to vary or constrain them with the extinction estimates 
of lAeol et al.l (I2009T) . 

Finally, we show for the first time that microlensing 
variability in a lens gives the same results when analyzing 
different portions of its light curve. The analysis of light 
curves LCI and LC2, corresponding to the 1st and 2nd 
halves of the 11 year OGLE monitoring period, lead to 
statistically consistent distributions for every parameter 
we consider. This both confirms our ability to measure 
parameters and gives us tighter constraints after combin- 
ing the results. It would be computationally challenging 
to analyze the full light curve simultaneously because 
it becomes (exponentially?) harder to fit longer light 
curves. However, such full analyses are likely needed for 
some quantities, particularly the magnification offsets, to 
converge. 

In the future we will likely include binaries, even 
though their effect is not likely to influence the results 
other than interpreting the meaning of the mean stellar 
mass (by up to O.O5M , as discussed in However, 
like the projection of our motion relative to the CMB, the 
streaming velocities in Q2237 are small compared to the 
peculiar velocities, and so are do little to break the de- 
generacy. The effects of streaming velocities will be seen 
most strongly in true disk lenses ( none are known, e x- 
cept, potentially PMN J2004-1349. lWinn et al.l(l200l) 
or in lenses such as Q J0158— 4325 (see iMorgan et al.1 
(2008) for a microlensing analysis of this active system) 
lying close to the equator of the CMB di pole, which will 
have the full 369 km s -1 dipole motion (Hinshaw et al. 
2009). These CMB equatorial lenses should also show 
significantly shorter microlensing variability time scales. 
Detecting this effect would be an independent confirma- 
tion of the kinematic origin of the dipole. 

Q2237 was a natural first candidate for a full anal- 
ysis with moving stars because of the excellent OGLE 
data, short microlensing timescales, and negligible time 
delays between the images. However, there is no prob- 
lem extending our approach to analyzing microlensing 
data with moving stars to any other microlensing anal- 
ysis. Even if the time delays are unknown, cases with 
different trial delays could simply be tried sequentially 
([Morgan et al.l 120081 ) . Moreover our method can eas- 
ily be extended to multi-wavelength data sets to exam- 
ine the structure of the accreti on disk varies with wave- 
length (jPoindexter et al.ll2008D . The memory require- 
ment s would be too great to fi t each band simultaneously 
as m iPoindexter et al.1 (120081), but we can use a modified 
version of the method iDai et ah (2009) applied to the 
joint optical and X-ray analysis of RXJ1131— 1231. The 
models are first run on the band with the most epochs. 
As good fitting trials are found, the starting points, ve- 
locities, and x 2 matrix are saved. Next, for each succes- 
sive band, we recompute the light curves corresponding 
to the epochs and source sizes of the other wavelengths, 
and the results of these new fits are used to continue 
the x 2 calculation. Since the overall execution times are 
only modestly longer than using static stars, there is no 
reason not to use this more physically correct approach. 
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